author:
  - A
  - B
submission:
  - PMIR
  - B
year: "2024"
file: 
related: 
tags: []
review date: 2025-01-03

Summary

1-Line

Event 카메라 기반 객체 인식에서 심각한 도메인 갭 문제를 해결하기 위해 간단하고 효과적인 테스트 타임 적응 알고리즘인 Ev-TTA를 제안함 [p.1 Abstract]

3-Line

Event 카메라는 극한 환경에서도 동작할 수 있지만, 기존 알고리즘들은 심각한 도메인 시프트로 인해 성능이 크게 저하됨 [p.1 Abstract]

"While event cameras are proposed to provide measurements of scenes with fast motions or drastic illumination changes, many existing event-based recognition algorithms suffer from performance deterioration under extreme conditions due to significant domain shifts."

Ev-TTA는 이벤트의 시공간적 특성을 활용한 손실 함수를 통해 사전 학습된 분류기를 테스트 단계에서 미세 조정함 [p.1 Abstract]

"Ev-TTA mitigates the severe domain gaps by fine-tuning the pre-trained classifiers during the test phase using loss functions inspired by the spatio-temporal characteristics of events."

다양한 이벤트 표현 방식과 태스크에서 추가 학습 없이도 큰 성능 향상을 보임 [p.1 Abstract]

"Ev-TTA demonstrates a large amount of performance gain on a wide range of event-based object recognition tasks without extensive additional training."

5-Line

Event 카메라는 극한 환경에서도 동작할 수 있지만, 기존 알고리즘들은 심각한 도메인 시프트로 인해 성능이 크게 저하됨 [p.1 Abstract]

"While event cameras are proposed to provide measurements of scenes with fast motions or drastic illumination changes, many existing event-based recognition algorithms suffer from performance deterioration under extreme conditions due to significant domain shifts."

Ev-TTA는 이벤트의 시공간적 특성을 활용한 손실 함수를 통해 사전 학습된 분류기를 테스트 단계에서 미세 조정함 [p.1 Abstract]

"Ev-TTA mitigates the severe domain gaps by fine-tuning the pre-trained classifiers during the test phase using loss functions inspired by the spatio-temporal characteristics of events."

다양한 이벤트 표현 방식과 태스크에서 추가 학습 없이도 큰 성능 향상을 보임 [p.1 Abstract]

"Ev-TTA demonstrates a large amount of performance gain on a wide range of event-based object recognition tasks without extensive additional training."```

이벤트의 시간적 특성을 활용하여 인접한 이벤트들에 대해 유사한 예측을 강제하고, 극한 조명 상황에서는 두 극성 간의 공간적 상관관계를 활용해 노이즈를 처리함 [p.1 Abstract]

"Since the event data is a temporal stream of measurements, our loss function enforces similar predictions for adjacent events to quickly adapt to the changed environment online. Also, we utilize the spatial correlations between two polarities of events to handle noise under extreme illumination."

입력 표현 방식에 관계없이 적용 가능하며 회귀 태스크로도 확장 가능함 [p.1 Abstract]

"Our formulation can be successfully applied regardless of input representations and further extended into regression tasks."

Paper Review

Abstract / Motivation

Event 카메라는 기존 프레임 기반 카메라의 한계를 극복할 수 있는 잠재력이 있지만, 데이터 획득과 인식 사이에 명확한 간극이 존재함 [p.1 Introduction]

"Being able to acquire visual information in challenging environments, event cameras have the potential to overcome the limitations of frame-based cameras."

"Despite the myriad of benefits that event cameras can offer, there is a clear gap between data acquisition and recognition."

특히 극한 환경에서 획득된 이벤트들은 일반적으로 노이즈가 많고 시각적 특징이 부족함 [p.1 Introduction]

"While event cameras can acquire meaningful information even in challenging environments, events obtained from these conditions are typically noisy and lack visual features."

Background

Event 카메라는 고동적 범위와 마이크로초 단위의 시간 해상도로 밝기 변화를 측정하는 뉴로모픽 센서임 [p.1 Introduction]

"Event cameras are neuromorphic sensors that produce a sequence of brightness changes with high dynamic range and microsecond-scale temporal resolution."

기존의 이벤트 기반 인식 알고리즘들은 다음과 같은 방식으로 구성됨 [p.3 Method]

이벤트들을 이미지와 유사한 표현으로 집계
기존 이미지 분류기 아키텍처를 사용해 클래스 확률 출력

"The classification algorithms are composed of a two-step procedure, where events are first aggregated to form an image-like representation, and further processed with conventional image classifier architectures to output class probabilities."

Targeting Problems

극한 환경(저조도, 극단적 움직임)에서의 성능 저하 [p.1 Introduction]

"Event-based object recognition algorithms are directly affected by these changes in input and the performance becomes very unstable."

다양한 외부 환경에서의 레이블된 데이터 수집의 어려움 [p.1 Introduction]

"Since it is difficult to manually collect labeled data in a wide variety of external conditions, an adaptation strategy is necessary to fully leverage the potential of event cameras."

저조도 조건에서의 심각한 노이즈 문제 [p.5 Section 3.2]

"The low light condition significantly deteriorates event-based vision algorithms... The severe noise in the extreme lighting condition is beyond the range of adversaries that previous approaches can handle."

Suggestions / Methods

1. 시간적 일관성을 위한 학습 목적 함수 [p.3-4 Section 3.1]

(a) 예측 유사성 손실(Prediction Similarity Loss)

시간적으로 인접한 이벤트들의 예측 레이블 분포가 유사하도록 강제함 [p.4 Section 3.1]
- "Prediction similarity loss enforces the predicted label distributions for the temporally neighboring events E1,...,EK to be similar"
대칭적 KL 발산을 사용하여 첫 번째 이벤트 슬라이스와 나머지 슬라이스 간의 불일치를 최소화 [p.4 Section 3.1]
- "Using the symmetric KL divergence SKL(P,Q) = DKL(P||Q) + DKL(Q||P), prediction similarity loss is defined as..."
계산 효율성을 위해 모든 쌍을 비교하지 않고 첫 번째 슬라이스를 앵커로 사용 [p.4 Section 3.1]
- "Since the extensive pair-wise comparison would lead to a quadratic increase in computation, we instead use the first event slice as an anchor"

(b) 선택적 엔트로피 손실(Selective Entropy Loss)

첫 번째 이벤트 슬라이스의 예측이 다른 슬라이스들과 일관될 때만 엔트로피를 최소화 [p.4 Section 3.1]
- "we propose to selectively minimize the prediction entropy of the first event slice E1 ⊂ E only if the prediction is consistent with other event slices"
투표 메커니즘을 통해 일관성 판단: 각 슬라이스가 가장 높은 확률의 클래스에 투표 [p.4 Section 3.1]
- "each event slice Ei casts a vote on the class label with the highest probability"
SENTRY와 달리 불일치 샘플의 엔트로피는 무시 [p.4-5 Section 3.1]
- "while SENTRY proposes to maximize the predicted entropy for samples that are inconsistent, we find that simply ignoring these samples is more effective"

2. 공간적 일관성을 활용한 조건부 노이즈 제거 [p.5 Section 3.2]

(a) 노이즈 버스트 탐지

양극성과 음극성 이벤트의 비율을 통계적으로 분석 [p.5 Section 3.2]
- "Let Npos, Nneg denote the number of pixels containing positive and negative events, respectively"
Geary의 변환을 사용하여 비율의 통계적 유의성 검정 [p.5 Section 3.2]
- "Assuming Npos, Nneg follow a Gaussian distribution, the following transformation to the ratio R = Npos/Nneg follows a standard Gaussian distribution"

(b) 선택적 노이즈 제거

극성 불균형이 큰 경우에만 노이즈 제거 적용 [p.5-6 Section 3.2]
- "The noise removal operation only takes place if there is a large imbalance in the ratio of positive and negative events"
반대 극성의 공간적 이웃이 부족한 픽셀을 노이즈로 판단 [p.5 Section 3.2]
- "we denoise the channel with noise burst if a pixel containing events lack spatial neighbors in the opposite polarity"

3. 최적화 전략 [p.4 Section 3.1]

(a) 배치 정규화 레이어 최적화

전체 네트워크가 아닌 배치 정규화 레이어만 최적화 [p.4 Section 3.1]
- "We constrain the optimization to only operate on the batch normalization layers of the pre-trained classifier"
타겟 도메인 데이터가 부족할 때 과적합 방지 [p.4 Section 3.1]
- "When the target domain data is scarce, altering the entire set of parameters may divert the model from essential priors obtained from the pre-training"

(b) 온라인/오프라인 적응

오프라인: 전체 타겟 도메인에 대해 먼저 최적화 후 평가 [p.3 Section 3.1]
- "In the offline setup, Ev-TTA is first optimized for the entire target domain, and subsequently performs another set of inferences for evaluation"
온라인: 평가와 최적화를 동시에 수행 [p.3 Section 3.1]
- "In the online setup, Ev-TTA is simultaneously evaluated and optimized, thus omitting the second inference phase"